Phonological Parsing for Bi-directional Letter-to-Sound/Sound-to-Letter Generation

Authors

  • Helen M. Meng
  • Stephanie Seneff
  • Victor Zue
Abstract

In this paper, we describe a reversible letter-to-sound/sound-to-letter generation system based on an approach which combines a rule-based formalism with data-driven techniques. We adopt a probabilistic parsing strategy to provide a hierarchical lexical analysis of a word, including information such as morphology, stress, syllabification, phonemics and graphemics. Long-distance constraints are propagated by enforcing local constraints throughout the hierarchy. Our training and testing corpora are derived from the high-frequency portion of the Brown Corpus (10,000 words), augmented with markers indicating stress and word morphology. We evaluated our performance on an unseen test set. The percentages of non-parsable words for letter-to-sound and sound-to-letter generation were 6% and 5% respectively. On the remaining words, our system achieved a word accuracy of 71.8% and a phoneme accuracy of 92.5% for letter-to-sound generation, and a word accuracy of 55.8% and a letter accuracy of 89.4% for sound-to-letter generation. We also compared our hierarchical approach with an alternative, single-layer approach to demonstrate how the hierarchy provides a parsimonious description of English orthographic-phonological regularities, while simultaneously attaining competitive generation accuracy.

Introduction

This paper describes a trainable probabilistic system for reversible letter-to-sound/sound-to-letter generation. Sound-to-letter generation is a crucial aspect of the problem of automatic detection and incorporation of new words, which is in turn critical for the development of large-vocabulary speech understanding systems. Moreover, letter-to-sound generation will continue to be important for speech output, especially in applications such as reading machines. To successfully achieve our goal, several important issues must be addressed.

First, what should be the inventory of linguistic or lexical units for describing English orthographic-phonological regularities? Second, how should these units be incorporated into the representation of English orthography and phonology? Third, what algorithms can be used to synthesize and analyze the spelling and pronunciation of an English word in terms of these lexical units? These three issues will be addressed in detail in what follows, where we describe our approach and report on our system's performance for both letter-to-sound [1] and sound-to-letter generation [2]. The novel features of our approach include the reversibility of the combined parsing and generative processes, the ability to provide multiple output hypotheses, the capability of handling uncertainty in the input, as well as our treatment of non-parsable words.

Footnote 1: This research was supported by ARPA under Contract N0001489-J-1332, monitored through the Office of Naval Research, and a grant from Apple Computer Inc.
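The accuracy figures in the abstract are reported only over the parsable words, so it can help to see what they imply for the whole test set. The sketch below is an illustration and not part of the paper: it assumes, pessimistically, that every non-parsable word counts as an error, which gives a lower bound on the fraction of all test words generated correctly. The function name `overall_rate` is hypothetical, and because the paper mentions a separate treatment of non-parsable words, the actual end-to-end figures may differ.

```python
def overall_rate(nonparsable_fraction: float, accuracy_on_parsable: float) -> float:
    """Fraction of ALL test words that are parsable and generated correctly.

    Assumption (not from the paper): non-parsable words are scored as errors.
    """
    return (1.0 - nonparsable_fraction) * accuracy_on_parsable


# Figures quoted in the abstract.
letter_to_sound = overall_rate(0.06, 0.718)  # 6% non-parsable, 71.8% word accuracy
sound_to_letter = overall_rate(0.05, 0.558)  # 5% non-parsable, 55.8% word accuracy

print(f"letter-to-sound lower bound: {letter_to_sound:.3f}")  # 0.675
print(f"sound-to-letter lower bound: {sound_to_letter:.3f}")  # 0.530
```

Under this assumption, roughly 67.5% of all test words would be pronounced correctly and roughly 53.0% spelled correctly, compared with the 71.8% and 55.8% reported over parsable words alone.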

Similar resources

Phonological Processing for Urdu Text to Speech System

Determining and modeling phonological phenomena is necessary to generate speech from textual input. These phenomena include letter to sound conversion, syllabification, sound change, stress assignment and intonation assignment. This paper presents work on Urdu phonological processes and provides algorithms to convert textual input into phonologically annotated output, required for Urdu text-to-...

Phonological awareness, letter-sound knowledge and word recognition in Greek deaf children

The aim of the study presented in this paper was to investigate the relation between phonological awareness and orthographic knowledge in deaf children who read in the transparent Greek orthography. Preschool and school-aged deaf children (N = 24) and two comparison groups of hearing children (N = 30) were administered measures of phonological awareness, letter-sound knowledge and word recognit...

Learning letter names and sounds: effects of instruction, letter type, and phonological processing skill.

Preschool-age children (N=58) were randomly assigned to receive instruction in letter names and sounds, letter sounds only, or numbers (control). Multilevel modeling was used to examine letter name and sound learning as a function of instructional condition and characteristics of both letters and children. Specifically, learning was examined in light of letter name structure, whether letter nam...

Spatial Attention Disorders in Developmental Dyslexia: Towards the Prevention of Reading Acquisition Deficits

Developmental dyslexia (DD) is a neurobiological disorder (see Habib, 2000; Demonet & Reilhac, 2012 in the present book for reviews) characterized by difficulties in reading acquisition despite adequate intelligence, conventional education, and motivation (American Psychiatric Association, 1994). It is widely believed that impaired phonological processing characterizes individuals with DD (...

The role of phonological awareness and letter-sound knowledge in the reading development of children with intellectual disabilities.

Our study investigated if phonological awareness and letter-sound knowledge were predictors of reading progress in children with intellectual disabilities (ID) with unspecified etiology. An academic achievement test was administered to 129 children with mild or moderate ID when they were 6-8 years old, as well as one and two school years later. Findings indicated that phonological awareness and...

Publication date: 1994